Session A-7

ML Security

Time
May 19 Fri, 8:30 AM — 10:00 AM EDT
Location
Babbio 122

Mixup Training for Generative Models to Defend Membership Inference Attacks

Zhe Ji, Qiansiqi Hu and Liyao Xiang (Shanghai Jiao Tong University, China); Chenghu Zhou (Chinese Academy of Sciences, China)

With the popularity of machine learning, there has been growing concern about trained models revealing private information from the training data. The membership inference attack (MIA) poses one such threat by inferring whether a given sample participated in the training of the target model. Although MIA has been intensively studied for discriminative models, it has seldom been investigated for generative models, nor has its defense. In this work, we propose a mixup training method for generative adversarial networks (GANs) as a defense against MIAs. Specifically, the original training data is replaced with interpolations of data pairs, so that GANs never overfit the original data. An intriguing part of our work is an analysis from the hypothesis-testing perspective, which theoretically proves that our method reduces the AUC of the strongest likelihood-ratio attack. Experimental results confirm that mixup training successfully defends against state-of-the-art MIAs on generative models without degrading model performance or requiring additional training effort, showing great promise for practical deployment.
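To make the defense concrete, here is a minimal sketch of mixup applied to the real batch in a standard PyTorch GAN step; the names (mixup_batch, real_batch) and the surrounding loss wiring are illustrative placeholders, not the authors' implementation.

import torch

def mixup_batch(real_batch: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # Replace real samples with convex combinations of random pairs, so
    # the discriminator never sees an original training point verbatim.
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(real_batch.size(0))
    return lam * real_batch + (1.0 - lam) * real_batch[perm]

# Hypothetical usage inside a GAN training step:
#   mixed = mixup_batch(real_batch)
#   d_loss = bce(disc(mixed), ones) + bce(disc(fake), zeros)

Because both networks only ever observe interpolated points, memorization of individual training samples, which is exactly the signal MIAs exploit, is suppressed.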
Speaker Zhe Ji (Shanghai Jiao Tong University)

Zhe Ji is a master's student at Shanghai Jiao Tong University, where he also received his bachelor's degree in computer science and technology. His current research interests focus on privacy issues in machine learning.


Spotting Deep Neural Network Vulnerabilities in Mobile Traffic Forecasting with an Explainable AI Lens

Serly Moghadas (IMDEA Networks, Spain); Claudio Fiandrino and Alan Collet (IMDEA Networks Institute, Spain); Giulia Attanasio (IMDEA Networks, Spain); Marco Fiore and Joerg Widmer (IMDEA Networks Institute, Spain)

The ability to forecast mobile traffic patterns at large is key to resource management for mobile network operators and local authorities. Several Deep Neural Networks (DNNs) have been designed to capture the complex spatio-temporal characteristics of mobile traffic patterns at scale. These models are complex black boxes whose decisions are inherently hard to explain. Even worse, they have proven vulnerable to adversarial attacks, which undermines their applicability in production networks. In this paper, we conduct a first in-depth study of the vulnerabilities of DNNs for large-scale mobile traffic forecasting. We propose DeExp, a new tool that leverages EXplainable Artificial Intelligence (XAI) to understand which Base Stations (BSs) are most influential for forecasting from a spatio-temporal perspective. This is challenging, as existing XAI techniques are usually applied to computer vision or natural language processing and need to be adapted to the mobile network context. Upon identifying the most influential BSs, we run state-of-the-art Adversarial Machine Learning (AML) techniques on those BSs and measure the accuracy degradation of the predictors. Extensive evaluations with real-world mobile traffic traces show that attacking BSs relevant to the predictor significantly degrades its accuracy across all scenarios.
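As a rough illustration of the attribution step, a gradient-saliency score per base station might look like the sketch below; the model interface and the (batch, time, num_bs) tensor layout are assumptions, not DeExp's actual pipeline.

import torch

def bs_influence(model, x: torch.Tensor) -> torch.Tensor:
    # x: (batch, time, num_bs) traffic tensor. The forecast is summed to
    # a scalar so one backward pass yields gradients for every input.
    x = x.detach().clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.abs().mean(dim=(0, 1))  # (num_bs,) influence scores

The highest-scoring base stations are then the natural targets for the adversarial perturbations evaluated in the paper.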
Speaker Claudio Fiandrino (IMDEA Networks Institute)

Claudio Fiandrino is a senior researcher at IMDEA Networks Institute. He obtained his Ph.D. degree at the University of Luxembourg in 2016. Claudio has received numerous awards for his research, including a Fulbright scholarship in 2022, a 5-year Spanish Juan de la Cierva grant, and several Best Paper Awards. He is a member of IEEE and ACM, serves on the Technical Program Committees (TPCs) of several international IEEE and ACM conferences, and regularly participates in the organization of events. Claudio is a member of the Editorial Board of IEEE Networking Letters and Chair of the IEEE ComSoc EMEA Awards Committee. His primary research interests include explainable and robust AI for mobile networks, next-generation mobile networks, and multi-access edge computing.


FeatureSpy: Detecting Learning-Content Attacks via Feature Inspection in Secure Deduplicated Storage

Jingwei Li (University of Electronic Science and Technology of China, China); Yanjing Ren and Patrick Pak-Ching Lee (The Chinese University of Hong Kong, Hong Kong); Yuyu Wang (University of Electronic Science and Technology of China, China); Ting Chen (University of Electronic Science and Technology of China (UESTC), China); Xiaosong Zhang (University of Electronic Science and Technology of China, China)

Secure deduplicated storage is a critical paradigm for cloud storage outsourcing to achieve both operational cost savings (via deduplication) and outsourced data confidentiality (via encryption). However, existing secure deduplicated storage designs are vulnerable to learning-content attacks, in which malicious clients can infer the sensitive contents of outsourced data by monitoring the deduplication pattern. We show via a simple case study that learning-content attacks are indeed feasible and can infer sensitive information in a short time under a real cloud setting. To defend against such attacks, we present FeatureSpy, a secure deduplicated storage system that effectively detects learning-content attacks, based on the observation that such attacks often generate a large volume of similar data. FeatureSpy builds on two core design elements: (i) similarity-preserving encryption that supports similarity detection on encrypted chunks, and (ii) shielded attack detection that leverages Intel SGX to accurately detect learning-content attacks without being readily evaded by adversaries. Trace-driven experiments on real-world and synthetic datasets show that our FeatureSpy prototype achieves high accuracy and low performance overhead in attack detection.
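The detection observation can be conveyed with a toy similarity feature over chunks; the real system computes similarity-preserving features over encrypted chunks inside SGX, so the plaintext sketch below (window size, threshold) only illustrates the idea.

import hashlib

def chunk_feature(chunk: bytes, window: int = 8) -> int:
    # Min-hash over sliding windows: similar chunks are likely to share
    # the same minimum, so near-duplicates collapse onto few values.
    return min(
        int.from_bytes(hashlib.sha1(chunk[i:i + window]).digest()[:8], "big")
        for i in range(max(1, len(chunk) - window + 1))
    )

def looks_like_attack(chunks, threshold: float = 0.5) -> bool:
    feats = [chunk_feature(c) for c in chunks]
    top = max(feats.count(f) for f in set(feats))
    return top / len(chunks) > threshold  # a flood of similar chunks is suspicious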
Speaker Patrick P. C. Lee (The Chinese University of Hong Kong)

Patrick Lee is a Professor in the Department of Computer Science and Engineering at the Chinese University of Hong Kong. His research interests are in storage systems, distributed systems and networks, and cloud computing.


Fast Generation-Based Gradient Leakage Attacks against Highly Compressed Gradients

Dongyun Xue, Haomiao Yang, Mengyu Ge and Jingwei Li (University of Electronic Science and Technology of China, China); Guowen Xu (Nanyang Technological University, Singapore); Hongwei Li (University of Electronic Science and Technology of China, China)

Federated learning (FL) is a distributed machine learning technology that preserves data privacy. However, recent gradient leakage attacks (GLAs) can reconstruct private training data from public gradients. These attacks either require modification of the FL model (analytics-based) or take a long time to converge (optimization-based), and they fail when dealing with the highly compressed gradients found in practical FL systems. In this paper, we pioneer a generation-based GLA method called FGLA that can reconstruct batches of user data while forgoing the optimization process. Specifically, we design a feature separation technique that extracts the feature of each sample in a batch and then generates the user data directly. Extensive experiments on multiple image datasets demonstrate that FGLA can reconstruct user images in milliseconds with a batch size of 512 from highly compressed gradients (compression ratio of 0.8% or higher), substantially outperforming state-of-the-art methods.
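A heavily simplified sketch of the feature-separation idea: the gradient of a model's final fully-connected layer carries, per labeled row, an approximation of that sample's feature vector, which can be fed straight into a generator with no optimization loop. The clean form below assumes distinct labels within the batch; the paper's technique handles the general batched, compressed case, and generator is a placeholder.

import torch

def separate_features(fc_weight_grad: torch.Tensor,
                      labels: torch.Tensor) -> torch.Tensor:
    # fc_weight_grad: (num_classes, feat_dim) gradient of the last FC
    # layer; row y approximates the feature of the sample labeled y,
    # up to scaling, even after aggressive sparsification.
    return fc_weight_grad[labels]  # (batch_size, feat_dim)

# images = generator(separate_features(grad, labels))  # one forward pass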
Speaker Dongyun Xue

Dongyun Xue is a graduate student at the University of Electronic Science and Technology of China, with a major research focus on artificial intelligence security.


Session Chair

Qiben Yan

Session A-8

Internet/Web Security

Time
May 19 Fri, 10:30 AM — 12:00 PM EDT
Location
Babbio 122

De-anonymization Attacks on Metaverse

Yan Meng, Yuxia Zhan, Jiachun Li, Suguo Du and Haojin Zhu (Shanghai Jiao Tong University, China); Sherman Shen (University of Waterloo, Canada)

Virtual reality (VR) provides users with an immersive experience as a fundamental technology in the metaverse. One of the most promising properties of VR is that users' identities can be protected by changing their physical-world appearances into arbitrary virtual avatars. However, recently proposed de-anonymization attacks demonstrate the feasibility of recognizing the user's identity behind the avatar's masking. In this paper, we propose AvatarHunter, a non-intrusive and user-unconscious de-anonymization attack based on victims' inherent movement signatures. AvatarHunter imperceptibly collects the victim avatar's gait information by recording videos from multiple views in the VR scenario without requiring any permission. A Unity-based feature extractor is designed that preserves the avatar's movement signature while remaining immune to changes in the avatar's appearance. Real-world experiments are conducted in VRChat, one of the most popular VR applications. The experimental results demonstrate that AvatarHunter achieves attack success rates of 92.1% and 66.9% in closed-world and open-world avatar settings, respectively, substantially outperforming existing works.
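An appearance-invariant movement feature of the kind such an extractor preserves can be as simple as per-frame joint angles, which ignore limb lengths and textures; the joint names and shapes below are illustrative, not the paper's Unity extractor.

import numpy as np

def joint_angle(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> np.ndarray:
    # Angle at joint b formed by segments b->a and b->c, per frame.
    u, v = a - b, c - b
    cos = (u * v).sum(-1) / (np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# hip, knee, ankle: (num_frames, 3) trajectories recorded from any view;
# the resulting angle series survives avatar re-skinning:
#   gait_signature = joint_angle(hip, knee, ankle)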
Speaker Yan Meng (Shanghai Jiao Tong University)

Yan Meng is a Research Assistant Professor in the Department of Computer Science and Engineering at Shanghai Jiao Tong University. He received his Ph.D. degree from Shanghai Jiao Tong University (2016–2022) and his B.Eng. degree from the Huazhong University of Science and Technology (2012–2016). His research focuses on IoT security, voice interface security, and privacy policy analysis. He has published 25 research papers, mainly in INFOCOM, CCS, USENIX Security, TDSC, and TMC. He won the Best Paper Award from SocialSec in 2015. He is the recipient of the 2022 ACM China Excellent Doctoral Dissertation Award.


DisProTrack: Distributed Provenance Tracking over Serverless Applications

Utkalika Satapathy and Rishabh Thakur (Indian Institute of Technology Kharagpur, India); Subhrendu Chattopadhyay (Institute for Development and Research in Banking Technology, India); Sandip Chakraborty (Indian Institute of Technology Kharagpur, India)

Provenance tracking has been widely used in the recent literature to debug system vulnerabilities and find the root causes behind faults, errors, or crashes in a running system. However, existing approaches primarily developed graph-based models for provenance tracking over monolithic applications running directly on the operating system kernel. In contrast, the modern DevOps-based service-oriented architecture relies on distributed platforms, like serverless computing, that use container-based sandboxing over the kernel. Provenance tracking over such a distributed microservice architecture is challenging, as the application and system logs are generated asynchronously and follow heterogeneous nomenclatures and logging formats. This paper develops a novel approach to combining system and microservice logs to generate a Universal Provenance Graph (UPG) that can be used for provenance tracking over serverless architectures. We develop a Loadable Kernel Module (LKM) for runtime unit identification over the logs by intercepting the system calls, with help from control-flow graphs over the static application binaries. Finally, we design a regular-expression-based log optimization method for reverse query parsing over the generated UPG. A thorough evaluation of the proposed UPG model with different benchmarked serverless applications shows the system's effectiveness.
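The UPG construction can be pictured as stitching two log streams into one graph keyed by the runtime unit identifier; the field names below are assumptions about the log schema, not DisProTrack's actual format.

import networkx as nx

def build_upg(sys_events, app_events):
    g = nx.DiGraph()
    for e in sys_events:   # e.g. {"pid": 42, "syscall": "open", "obj": "/etc/passwd", "ts": ...}
        g.add_edge(f"proc:{e['pid']}", f"obj:{e['obj']}", label=e["syscall"], ts=e["ts"])
    for e in app_events:   # e.g. {"unit": "invoke-7", "pid": 42, "ts": ...}
        g.add_edge(f"unit:{e['unit']}", f"proc:{e['pid']}", label="emits", ts=e["ts"])
    return g

# Backward provenance of a suspicious object is then a reverse
# reachability query: nx.ancestors(g, "obj:/etc/passwd")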
Speaker Utkalika Satapathy (Indian Institute of Technology Kharagpur, India)

I am a Research Scholar in the Department of Computer Science and Engineering at the Indian Institute of Technology (IIT) Kharagpur, India, pursuing my Ph.D. under the supervision of Prof. Sandip Chakraborty.

I am also a member of the Ubiquitous Networked Systems Lab (UbiNet) research group at IIT Kharagpur. My research interests revolve around systems, provenance tracking, and distributed systems.


ASTrack: Automatic detection and removal of web tracking code with minimal functionality loss

Ismael Castell-Uroz (Universitat Politècnica de Catalunya, Spain); Kensuke Fukuda (National Institute of Informatics, Japan); Pere Barlet-Ros (Universitat Politècnica de Catalunya, Spain)

Recent advances in web technologies make it more difficult than ever to detect and block web tracking systems. In this work, we propose ASTrack, a novel approach to web tracking detection and removal. ASTrack uses an abstraction of code structure based on Abstract Syntax Trees to selectively identify web tracking functionality shared across multiple web services. This new methodology allows us to: (i) effectively detect web tracking code even when evasion techniques are used (e.g., obfuscation, minification, or webpackaging), and (ii) safely remove the portions of code related to tracking without affecting the legitimate functionality of the website. Our evaluation with the top 10K most popular Internet domains shows that ASTrack can detect web tracking with high precision (98%), while discovering about 50K tracking code pieces and more than 3,400 new tracking URLs not previously recognized by the most popular privacy-preserving tools (e.g., uBlock Origin). Moreover, ASTrack achieves a 36% reduction in functionality loss in comparison with filter lists, one of the safest options available. Using a novel methodology that combines computer vision and manual inspection, we estimate that full functionality is preserved in more than 97% of websites.
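Structural fingerprinting of the kind ASTrack performs can be sketched in a few lines: hash the shape of the syntax tree while discarding identifiers and literals, so renamed or minified copies of the same tracking routine collide. The sketch parses Python with the stdlib ast module purely for illustration; ASTrack itself operates on JavaScript.

import ast, hashlib

def structure_hash(source: str) -> str:
    # Keep only node types (the tree's shape), never names or values.
    shape = "".join(type(n).__name__ for n in ast.walk(ast.parse(source)))
    return hashlib.sha256(shape.encode()).hexdigest()

# Same structure, different identifiers => same fingerprint:
assert structure_hash("uid = read_cookie(name)") == structure_hash("a = b(c)")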
Speaker Ismael Castell-Uroz (Universitat Politècnica de Catalunya)

Ismael Castell-Uroz is a Ph.D. student in the Computer Architecture Department of the Universitat Politècnica de Catalunya (UPC), Barcelona, Spain, where he received the B.Sc. degree in Computer Science in 2008 and the M.Sc. degree in Computer Architecture, Networks, and Systems in 2010. He has several years of experience in network and system administration and currently holds a projects scholarship at UPC. His expertise and research interests are in computer networks, especially network monitoring, anomaly detection, Internet privacy, and web tracking.


Secure Middlebox Channel over TLS and its Resiliency against Middlebox Compromise

Kentaro Kita, Junji Takemasa, Yuki Koizumi and Toru Hasegawa (Osaka University, Japan)

A large portion of Internet traffic passes through middleboxes that read or modify messages. However, as more traffic is protected with TLS, middleboxes are becoming unable to provide their functions. To leverage middlebox functionality while preserving communication security, secure middlebox channel protocols have been designed as extensions of TLS. The key idea is that the endpoints explicitly incorporate middleboxes into the TLS handshake and grant each middlebox either the read or the write permission for their messages. Because each middlebox has the least data-access privilege, these protocols are resilient against the compromise of a single middlebox. However, existing studies have not comprehensively analyzed communication security in scenarios where multiple middleboxes are compromised. In this paper, we present novel attacks that break the security of the existing protocols in such scenarios, and we then modify maTLS, the state-of-the-art protocol, so that all the attacks are prevented with marginal overhead.
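The least-privilege principle these protocols enforce, read permission as one key and write permission as another, can be shown with a toy integrity check (encryption elided for brevity); this is a didactic sketch, not the maTLS handshake or record format.

import hmac, hashlib, os

write_key = os.urandom(32)  # held only by write-permitted parties

def write_tag(msg: bytes) -> bytes:
    # Only holders of write_key can re-tag a modified message.
    return hmac.new(write_key, msg, hashlib.sha256).digest()

msg = b"GET / HTTP/1.1"
tag = write_tag(msg)
# A read-only middlebox may inspect msg (decryption elided) but cannot
# produce a valid tag for an altered message, so tampering is detected:
assert hmac.compare_digest(write_tag(msg), tag)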
Speaker Kentaro Kita (Osaka University)

Kentaro Kita received his Ph.D. in information science from Osaka University. His research interests include privacy, anonymity, security, and future networking architecture.


Session Chair

Ning Zhang

Session A-9

IoT

Time
May 19 Fri, 1:30 PM — 3:00 PM EDT
Location
Babbio 122

Enable Batteryless Flex-sensors via RFID Tags

Mengning Li (North Carolina State University, USA)

Detection of the flex-angle of objects or human bodies can benefit various scenarios such as robotic arm control, medical rehabilitation, and deformation detection. However, the two common solutions, flex sensors and computer vision methods, have inherent limitations: (i) battery-powered flex sensors have limited system lifetime; (ii) computer vision methods fail in Non-Line-of-Sight (NLoS) scenarios. To overcome these limitations, we propose, for the first time, an RFID-based Flex-sensor (RFlexor) system that enables batteryless flex-angle detection in NLoS scenarios. The basic insight of RFlexor is that flexing a tag affects its hardware characteristics and thus changes the phase and Received Signal Strength Indicator (RSSI) observed at the reader antenna. To capture the relationship between phase/RSSI and flex-angle, we train a multi-input AI model in advance. By feeding the processed data into the model, we can accurately detect tag flex-angles. We implement RFlexor with Commercial-Off-The-Shelf (COTS) RFID devices. Extensive experiments reveal that RFlexor achieves fine-grained flex-angle detection: the detection error is less than 10 degrees with probability higher than 90% under most conditions, and the average detection error is always less than 10 degrees across all experiments.
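The multi-input model can be pictured as two small branches, one per signal channel, fused into an angle regressor; the window length and layer sizes below are assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class FlexNet(nn.Module):
    def __init__(self, win: int = 64):
        super().__init__()
        self.phase = nn.Sequential(nn.Linear(win, 32), nn.ReLU())
        self.rssi = nn.Sequential(nn.Linear(win, 32), nn.ReLU())
        self.head = nn.Linear(64, 1)  # flex angle in degrees

    def forward(self, phase, rssi):
        return self.head(torch.cat([self.phase(phase), self.rssi(rssi)], dim=-1))

model = FlexNet()
angle = model(torch.randn(8, 64), torch.randn(8, 64))  # (8, 1) predictions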
Speaker Mengning Li

Mengning Li is a first-year Ph.D. student at North Carolina State University, where she is fortunate to be advised by Prof. Wenye Wang. Her research interest mainly lies in wireless sensing.


TomoID: A Scalable Approach to Device Free Indoor Localization via RFID Tomography

Yang-Hsi Su and Jingliang Ren (University of Michigan, USA); Zi Qian (Tsinghua University, China); David Fouhey and Alanson Sample (University of Michigan, USA)

Device-free localization methods allow users to benefit from location-aware services without the need to carry a transponder. However, conventional radio sensing approaches using active wireless devices require wired power or continual battery maintenance, limiting deployability. We present TomoID, a real-time multi-user UHF RFID tomographic localization system that uses low-level communication channel parameters such as RSSI, RF phase, and read rate to create probability heatmaps of users' locations. The heatmaps are passed to our custom-designed signal processing and machine learning pipeline to robustly predict users' locations. Results show that TomoID is highly accurate, with an average mean error of 17.1 cm for a stationary user and 18.9 cm when users are walking. With multi-user tracking, results show an average mean error of 70.0 cm for five individuals in constant motion. Importantly, TomoID is specifically designed to work in real-world multipath-rich indoor environments. Our signal processing and machine learning pipeline allows a pre-trained localization model to be applied to new environments of different shapes and sizes while maintaining accuracy sufficient for indoor user localization and tracking. Ultimately, TomoID enables a scalable, easily deployable, and minimally intrusive method for locating uninstrumented users in indoor environments.
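The heatmap stage can be caricatured as each attenuated reader-tag link voting for grid cells near it; the midpoint-distance weighting and grid size below are crude assumptions standing in for the paper's learned pipeline.

import numpy as np

def heatmap(links, grid=(50, 50), sigma=2.0):
    # links: iterable of ((x1, y1), (x2, y2), attenuation) in grid coords.
    h = np.zeros(grid)
    ys, xs = np.mgrid[0:grid[0], 0:grid[1]]
    for (x1, y1), (x2, y2), att in links:
        # Distance to the link midpoint as a cheap proxy for proximity
        # to the link's line of sight.
        d = np.hypot(xs - (x1 + x2) / 2, ys - (y1 + y2) / 2)
        h += att * np.exp(-(d ** 2) / (2 * sigma ** 2))
    return h / (h.max() or 1.0)  # normalized, probability-like map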
Speaker Yang-Hsi Su (University of Michigan - Ann Arbor)

Yang-Hsi Su is a third-year Ph.D. student in the Interactive Sensing and Computing Lab led by Prof. Alanson Sample at the University of Michigan. His research mainly focuses on RF sensing and RF localization.


Extracting Spatial Information of IoT Device Events for Smart Home Safety Monitoring

Yinxin Wan, Xuanli Lin, Kuai Xu, Feng Wang and Guoliang Xue (Arizona State University, USA)

Smart home IoT devices have been widely deployed and connected to many home networks for various applications such as intelligent home automation, connected healthcare, and security surveillance. The informative network traffic traces generated by IoT devices have enabled recent research advances in smart home network measurement. However, due to the cloud-based communication model of smart home IoT devices and the lack of traffic data collected at the cloud end, little effort has been devoted to extracting the spatial information of IoT device events, i.e., determining where a device event is triggered. In this paper, we examine why extracting device events' spatial information is challenging by analyzing the communication model of the smart home IoT system. We then propose a system named IoTDuet that determines whether a device event is triggered locally or remotely, utilizing the fact that controlling devices such as smartphones and tablets always communicate with cloud servers under static domain names when issuing commands from the home network. We further show the importance of extracting this critical spatial information by exploring its applications in smart home safety monitoring.
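The local-vs-remote decision reduces to a log-correlation test: did a controlling device inside the home contact the vendor's command domain just before the device event? The field names and the five-second window below are illustrative assumptions.

def is_locally_triggered(event_ts, phone_dns_log, command_domain, window=5.0):
    # phone_dns_log entries, e.g. {"ts": 1718000000.0, "domain": "cmd.vendor.com"},
    # are captured at the home gateway, so only in-home queries appear here.
    return any(
        q["domain"] == command_domain and 0 <= event_ts - q["ts"] <= window
        for q in phone_dns_log
    )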
Speaker Yinxin Wan (Arizona State University)

Yinxin Wan is a final-year Ph.D. candidate majoring in Computer Science at Arizona State University. He obtained his B.E. degree from the University of Science and Technology of China in 2018. His research interests include cybersecurity, IoT, network measurement, and data-driven networked systems.


RT-BLE: Real-time Multi-Connection Scheduling for Bluetooth Low Energy

Yeming Li and Jiamei Lv (Zhejiang University, China); Borui Li (Southeast University, China); Wei Dong (Zhejiang University, China)

Bluetooth Low Energy (BLE) is one of the most popular wireless protocols for building IoT applications. However, BLE suffers from three major issues that make it unable to provide reliable service to time-critical IoT applications. First, BLE operates in the crowded 2.4 GHz frequency band, which can lead to a high packet loss rate. Second, it is common for one device to connect with multiple BLE peripherals, which can lead to severe collision issues. Third, there is a long delay in re-allocating time resources. In this paper, we propose RT-BLE, a real-time multi-connection scheduling scheme for BLE. We first model BLE transmission latency in noisy RF environments, taking the BLE retransmission mechanism into account. With this model, RT-BLE derives a set of initial connection parameters. Then, RT-BLE uses collision-tree-based time-resource scheduling to efficiently manage time resources. Finally, we propose a subrating-based fast connection re-scheduling method to update the connection parameters and the position of anchor points. Results show that RT-BLE provides reliable service and that the error of our model is less than 0.69%. Compared with existing works, the re-scheduling delay is reduced by up to 86.25% and capacity is up to 4.33x higher.
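A back-of-the-envelope form of the latency model: if a lost packet is retried at the next connection event, a per-packet loss rate p makes the expected delivery time roughly the connection interval times 1/(1-p). The names below are illustrative, and the paper's model is considerably finer-grained.

def expected_latency_ms(conn_interval_ms: float, loss_rate: float) -> float:
    assert 0.0 <= loss_rate < 1.0
    # Geometric number of attempts: 1 / (1 - p) connection events on average.
    return conn_interval_ms / (1.0 - loss_rate)

# e.g. a 50 ms interval at 20% loss: expected_latency_ms(50, 0.2) -> 62.5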
Speaker Yeming Li (Zhejiang University)

Yeming Li received the B.S. degree in computer science from Zhejiang University of Technology in 2020. He is currently pursuing the Ph.D. degree at Zhejiang University. His research interests include the Internet of Things and wireless protocols.


Session Chair

Gianluca Rizzo

Session A-10

Distributed Learning

Time
May 19 Fri, 3:30 PM — 5:00 PM EDT
Location
Babbio 122

DIAMOND: Taming Sample and Communication Complexities in Decentralized Bilevel Optimization

Peiwen Qiu, Yining Li and Zhuqing Liu (The Ohio State University, USA); Prashant Khanduri (University of Minnesota, USA); Jia Liu and Ness B. Shroff (The Ohio State University, USA); Elizabeth Serena Bentley (AFRL, USA); Kurt Turck (United States Air Force Research Labs, USA)

Decentralized bilevel optimization has received increasing attention recently due to its foundational role in many emerging multi-agent learning paradigms (e.g., multi-agent meta-learning and multi-agent reinforcement learning) over peer-to-peer edge networks. However, to work with the limited computation and communication capabilities of edge networks, a major challenge in developing decentralized bilevel optimization techniques is to lower sample and communication complexities. This motivates us to develop a new decentralized bilevel optimization framework called DIAMOND (decentralized single-timescale stochastic approximation with momentum and gradient tracking). The contributions of this paper are as follows: i) our DIAMOND algorithm adopts a single-loop structure rather than the natural double-loop structure of bilevel optimization, which offers low computation and implementation complexity; ii) compared to existing approaches, DIAMOND does not require any full gradient evaluations, which further reduces both sample and computational complexities; iii) through a careful integration of momentum information and gradient tracking techniques, we show that DIAMOND enjoys O(ε^(-3/2)) sample and communication complexities for achieving an ε-stationary solution, both of which are independent of dataset sizes and significantly outperform existing works. Extensive experiments also verify our theoretical findings.
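The two ingredients named in the abstract, momentum estimation and gradient tracking over a mixing matrix, compose into a single-loop update of the following generic shape; this is the standard template, with DIAMOND's bilevel-specific gradient estimators abstracted into the grads argument.

import numpy as np

def diamond_like_step(x, y, g_prev, grads, W, lr=0.01, beta=0.9):
    # x: (n, d) agent iterates; y: (n, d) tracked global gradient;
    # g_prev/grads: previous and fresh stochastic gradient estimates;
    # W: (n, n) doubly stochastic mixing matrix of the network.
    g = beta * g_prev + (1 - beta) * grads  # momentum estimate
    y = W @ y + g - g_prev                  # gradient tracking
    x = W @ x - lr * y                      # consensus + descent step
    return x, y, g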
Speaker Peiwen Qiu (The Ohio State University)

Peiwen Qiu is a Ph.D. student at The Ohio State University under the supervision of Prof. Jia (Kevin) Liu. Her research interests include but are not limited to optimization theory and algorithms for bilevel optimization, decentralized bilevel optimization and federated learning.


PipeMoE: Accelerating Mixture-of-Experts through Adaptive Pipelining

Shaohuai Shi (Harbin Institute of Technology, Shenzhen, China); Xinglin Pan and Xiaowen Chu (Hong Kong Baptist University, Hong Kong); Bo Li (Hong Kong University of Science and Technology, Hong Kong)

Large models have attracted much attention in the AI area. The sparsely activated mixture-of-experts (MoE) technique pushes model size to the trillion level with a sub-linear increase in computation, as an MoE layer can be equipped with many separate experts but only one or two experts need to be activated for each input. However, MoE's dynamic expert activation introduces extensive communication in distributed training. In this work, we propose PipeMoE, which adaptively pipelines the communications and computations in MoE to maximally hide communication time. Specifically, we first identify the root reason why a higher pipeline degree does not always achieve better performance in training MoE models. We then formulate an optimization problem that aims to minimize training iteration time. To solve this problem, we build performance models for the computation and communication tasks in MoE and develop an optimal solution that determines the pipeline degree minimizing iteration time. We conduct extensive experiments with 174 typical MoE layers and two real-world NLP models on a 64-GPU cluster. Experimental results show that PipeMoE almost always chooses the best pipeline degree and outperforms state-of-the-art MoE training systems by 5%-77% in training time.
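The core trade-off, where more micro-batches overlap better but pay more startup overhead, can be captured by a small analytic model like the sketch below; the cost coefficients are made-up numbers, not PipeMoE's fitted performance models.

def iter_time(r, comp, comm, startup=1e-4):
    # r pipeline stages of comp/r compute and comm/r all-to-all each:
    # interior stages overlap, leaving one exposed stage of each kind
    # plus per-stage startup cost.
    c, m = comp / r, comm / r
    return c + m + (r - 1) * max(c, m) + r * startup

# Pick the degree that minimizes modeled iteration time:
best_r = min(range(1, 17), key=lambda r: iter_time(r, comp=8e-3, comm=6e-3))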
Speaker Shaohuai Shi

Shaohuai Shi is currently an Assistant Professor at the School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen. Previously, he was a Research Assistant Professor at the Department of Computer Science & Engineering of The Hong Kong University of Science and Technology. His current research focus is distributed machine learning systems.


Accelerating Distributed K-FAC with Efficient Collective Communication and Scheduling

Lin Zhang (Hong Kong University of Science and Technology, Hong Kong); Shaohuai Shi (Harbin Institute of Technology, Shenzhen, China); Bo Li (Hong Kong University of Science and Technology, Hong Kong)

Kronecker-factored approximate curvature (K-FAC) has been shown to achieve faster convergence than SGD in training deep neural networks. However, existing distributed K-FAC (D-KFAC) relies on the all-reduce collective for communications and scheduling, which incurs excessive communications in each iteration. In this work, we propose a new form of D-KFAC with a reduce-based alternative to eliminate redundant communications. This poses new challenges and opportunities in that the reduce collective requires a root worker to collect the results, which considerably complicates the communication scheduling. To this end, we formulate an optimization problem that determines tensor fusion and tensor placement simultaneously aiming to minimize the training iteration time. We develop novel communication scheduling strategies and propose a placement-aware D-KFAC (PAD-KFAC) training algorithm, which further improves communication efficiency. Our experimental results on a 64-GPU cluster interconnected with 10Gb/s and 100Gb/s Ethernet show that our PAD-KFAC can achieve an average of 27% and 17% improvement over state-of-the-art D-KFAC methods, respectively.
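The placement half of the problem can be approximated by a greedy load balancer: give each layer's reduce operation the root worker that currently receives the fewest bytes. This ignores the tensor-fusion dimension the paper co-optimizes; the sizes and the greedy rule are illustrative only.

def place_roots(tensor_sizes, num_workers):
    # Returns a root worker index for each tensor, balancing total
    # bytes received per root (largest tensors placed first).
    load = [0] * num_workers
    assign = {}
    for i in sorted(range(len(tensor_sizes)), key=lambda i: -tensor_sizes[i]):
        root = min(range(num_workers), key=load.__getitem__)
        assign[i] = root
        load[root] += tensor_sizes[i]
    return [assign[i] for i in range(len(tensor_sizes))]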
Speaker Lin Zhang (Hong Kong University of Science and Technology)

Lin Zhang is currently pursuing the Ph.D. degree in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology. His research interests include machine learning systems and algorithms, with a special focus on distributed DNNs training, and second-order optimization methods.


DAGC: Data-aware Adaptive Gradient Compression

Rongwei Lu (Tsinghua University, China); Jiajun Song (Dalian University of Technology, China); Bin Chen (Harbin Institute of Technology, Shenzhen, China); Laizhong Cui (Shenzhen University, China); Zhi Wang (Tsinghua University, China)

Gradient compression algorithms are widely used to alleviate the communication bottleneck in distributed ML. However, existing gradient compression algorithms suffer from accuracy degradation in non-IID scenarios, because a uniform compression scheme is used to compress gradients at workers with different data distributions and volumes: workers with larger volumes of data are forced to adapt to the same aggressive compression ratios as others. Assigning different compression ratios to workers with different data distributions and volumes is thus a promising solution.

In this study, we first derive a function capturing the correlation between the compression ratios at different workers and the number of training iterations needed for a model to converge to a given accuracy; this function shows, in particular, that workers with larger data volumes should be assigned higher compression ratios to guarantee better accuracy. We then formulate the assignment of compression ratios to workers as an n-variable chi-square nonlinear optimization problem under a fixed and limited total communication constraint. We propose an adaptive gradient compression strategy called DAGC, which assigns each worker a different compression ratio according to its data volume. Our experiments confirm that DAGC achieves better performance in the face of highly imbalanced data volume distributions and restricted communication.
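The headline rule, milder compression for data-rich workers under a global budget, admits a one-line illustrative allocation; the volume-proportional form below is a stand-in for the closed-form assignment the paper derives.

def assign_ratios(volumes, total_budget):
    # volumes: samples per worker; a ratio of 1.0 means no compression,
    # and the per-worker ratios sum to the communication budget.
    s = sum(volumes)
    return [min(1.0, total_budget * v / s) for v in volumes]

# assign_ratios([100, 400, 500], total_budget=0.3) -> [0.03, 0.12, 0.15]:
# the data-rich worker keeps five times more gradient than the smallest.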
Speaker Rongwei Lu (Tsinghua University)

Rongwei Lu is a second-year Master's student in Computer Technology at Tsinghua University, advised by Prof. Zhi Wang. His research interests lie in accelerating machine learning training from both the communication and computation perspectives. He was a research intern in the System Research Group of MSRA. This paper is his first published paper.


Session Chair

Yanjiao Chen

